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Emotion recognition through computational modeling and analysis of physiological signals has been widely 
investigated in the last decade. Most of the proposed emotion recognition systems require relatively 
long-time series of multivariate records and do not provide accurate real-time characterizations using 
short-time series. To overcome these limitations, we propose a novel personalized probabilistic framework 
able to characterize the emotional state of a subject through the analysis of heartbeat dynamics exclusively. 
The study includes thirty subjects presented with a set of standardized images gathered from the 
international affective picture system, alternating levels of arousal and valence. Due to the intrinsic 
nonlinearity and nonstationarity of the RR interval series, a specific point-process model was devised for 
instantaneous identification considering autoregressive nonlinearities up to the third-order according to 
the Wiener- Volterra representation, thus tracking very fast stimulus-response changes. Features from the 
instantaneous spectrum and bispectrum, as well as the dominant Lyapunov exponent, were extracted and 
considered as input features to a support vector machine for classification. Results, estimating emotions 
each 10 seconds, achieve an overall accuracy in recognizing four emotional states based on the circumplex 
model of affect of 79.29%, with 79.15% on the valence axis, and 83.55% on the arousal axis. 



The detection and recognition of emotional information is an important topic in the field of affective 
computing, i.e. the study of human affects by technological systems and devices'. Changes in emotional 
states often reflect facial, vocal, and gestural modifications in order to communicate, sometimes sub- 
unconsciously, personal feelings to other people. Such changes can be generalized across cultures, e.g. nonverbal 
emotional, or can be culture-specific^. Since mood alteration strongly affects the normal emotional process, 
emotion recognition is also an ambitious objective in the field of mood disorder psychopathology. In the last 
decade, several efforts have tried to obtain a reliable methodology to automatically identify the emotional/mood 
state of a subject, starting from the analysis of facial expressions, behavioral correlates, and physiological signals. 
Despite such efforts, current practices still use simple mood questionnaires or interviews for emotional assess- 
ment. In mental care, for instance, the diagnosis of pathological emotional fluctuations is mainly made through 
the physician's experience. Several epidemiological studies report that more than two million Americans have 
been diagnosed with bipolar disorder^, and about 82.7 million of the adult European population from 18 to 65 
years of age, have been affected by at least one mental disorder*. Several computational methods for emotion 
recognition based on variables associated with the Central Nervous System (CNS), for example the 
Electroencephalogram (EEC), have been recently proposed^ '^. These methods are justified by the fact that 
human emotions originate in the cerebral cortex involving several areas for their regulation and feeling. The 
prefrontal cortex and amygdala, in fact, represent the essence of two specific pathways: affective elicitations longer 
than 6 seconds allow the prefrontal cortex to encode the stimulus information and transmit it to other areas of the 
Central Autonomic Network (CAN) to the brainstem, thus producing a context appropriate response"; briefly 
presented stimuli access the fast route of emotion recognition via the amygdala. Of note, it has been found that the 
visual cortex is involved in emotional reactions to different classes of stimuli'*. Dysfunctions on these CNS 
recruitment circuits lead to pathological effects'^ such as anhedonia, i.e. the loss of pleasure or interest in 
previously rewarding stimuli, which is a core feature of major depression and other serious mood disorders. 
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Figure 1 A graphical representation of the circumplex model of affect with the horizontal axis representing the valence or pleasant dimension and the 
vertical axis representing the arousal or activation dimension'". 



Given the CAN involvement in emotional responses, an important 
direction for affective computing studies is related to changes of the 
Autonomic Nervous System (ANS) activity as elicited by specific 
emotional states. Monitoring physiological variables linked to ANS 
activity, in fact, can be easily performed through wearable systems, 
e.g. sensorized t-shirts""'" or gloves"*''"'. Its dynamics is thought to 
be less sensitive to artifacting events than in the EEG case. 
Moreover, the human vagus nerve is anatomically linked to the 
cranial nerves that regulate social engagement via facial expression 
and vocalization. Engineering approaches to assess ANS patterns 
related to emotions constitute a relevant part of the state-of-the-art 
methods used in affective computing. For example, a recent review 
written by Calvo et al.''^ reports on emotion theories as well as on 
affect detection systems using physiological and speech signals 
(also reviewed in^°), face expression and movement analysis. Long 
multivariate recordings are currently needed to accurately character- 
ize the emotional state of a subject. Such a constraint surely reduces 
the potential wide spectrum of real applications due to computa- 
tional cost and number of sensors. More recently, EGG morpho- 
logical analysis by Hilbert-Huang transform^', mutual information 
analysis of respiratory signals^^, and a multiparametric approach 
related to ANS activity^' have been proposed to assess human affect- 
ive states. 

Experimental evidence over the past two decades shows that Heart 
Rate Variability (HRV) analysis, in both time and frequency domain, 
can provide a unique, noninvasive assessment of autonomic func- 
tion^'*'^^''"'. Nevertheless, HRV analysis by means of standard proce- 
dures presents several limitations when high time and frequency 
resolutions are needed, due mainly to associated inherent assump- 
tions of stationarity required to define most of the relevant HRV time 
and frequency domain indices^'*'^^. More importantly, standard 
methods are generally not suitable to provide accurate nonlinear 
measures in the absence of information regarding phase space fitting. 
It has been well-accepted by the scientific community that physio- 
logical models should be nonlinear in order to thoroughly describe 
the characteristics of such complex systems. Within the cardiovascu- 
lar system, the complex and nonstationary dynamics of heartbeat 
variations have been associated to nonlinear neural interactions 
and integrations occurring at the neuron and receptor levels, so that 
the sinoatrial node responds in a nonlinear way to the changing levels 
of efferent autonomic inputs"'^. In fact, HRV nonlinear measures 



have been demonstrated to be of prognostic value in aging and dis- 
eases^'*"^'"'^* '"'''". In several previous works^^ '", we have demonstrated 
how it is possible to estimate heartbeat dynamics in cardiovascular 
recordings under nonstationary conditions by means of the analysis 
of the probabilistic generative mechanism of the heartbeat. 
Goncerning emotion recognition, we recently demonstrated the 
important role of nonlinear dynamics for a correct arousal and val- 
ence recognition from ANS signals'*'' including a preliminary 
feasibility study on the dataset considered here'''. 

In the light of all these issues, we here propose a new methodology 
in the field of affective computing, able to recognize emotional 
swings (positive or negative), as well as two levels of arousal and 
valence (low-medium and medium-high), using only one biosignal, 
the EGG, and able to instantaneously assess the subject's state even in 
short-time events (<10 seconds). Emotions associated with a short- 
time stimulus are identified through a self-reported label as well as 
four specific regions in the arousal-valence orthogonal dimension 
(see Fig. 1). The proposed methodology is fuUy based on a persona- 
lized probabilistic point process nonlinear model. In general, we 
model the probability function of the next heartbeat given the past 
Revents. The probability function is fully parametrized, considering 
up to cubic autoregressive Wiener- Volterra relationship to model its 
first order moment. All the considered features are estimated by the 
linear, quadratic, and cubic coefficients of the linear and nonlinear 
terms of such a relationship. As a consequence, our model provides 
the unique opportunity to take into account all the possible linear 
and nonlinear features, which can be estimated from the model 
parameters. Importantly as the probability function is defined at each 
moment in time, the parameter estimation is performed instanta- 
neously, a feature not reliably accomplishable by using other more 
standard linear and nonlinear indices such as pNN50%, triangular 
index, RMSSD, Recurrence Quantification Analysis, etc.^'''^^'"'*. In 
particular, the linear terms allow for the instantaneous spectral 
estimation, the quadratic terms allow for the instantaneous bispectral 
estimation, whereas the dominant Lyapunov exponent can be 
defined by considering the cubic terms. Of note, the use of higher 
order statistics (HOS) to estimate our features is encouraged by the 
fact that quantification of HRV nonlinear dynamics play a crucial 
role in emotion recognition systems'''' '"''" extending the information 
given by spectral analysis, and providing useful information on the 
nonlinear frequency interactions. 
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Results 

Experimental protocol. The recording paradigm related to this work 
has been previously described in'*''''"'. We adopted a common 
dimensional model which uses multiple dimensions to categorize 
emotions, the Circumplex Model of Affects (CMA)''^ The CMA 
used in our experiment takes into account two main dimensions 
conceptualized by the terms of valence and arousal (see Fig. 1). 
Valence represents how much an emotion is perceived as positive 
or negative, whereas arousal indicates how strongly the emotion is 
felt. Accordingly, we employed visual stimuli belonging to an 
international standardized database having a specific emotional 
rating expressed in terms of valence and arousal. Specifically, we 
chose the International Affective Picture System (IAPS)^°, which is 
one of the most frequently cited tools in the area of affective 
stimulation. The TAPS is a set of 944 images with emotional 
ratings based on several studies previously conducted where 
subjects were requested to rank these images using the self 
assessment manikin (both valence and arousal scales range from 0 
to 10). A general overview of the experimental protocol and analysis 
is shown in Fig. 2. The passive affective elicitation performed through 
the lAPS images stimulates several cortical areas also allowing the 
prefrontal cortex modulation generating cognitive perceptions". An 
homogeneous population of 30 healthy subjects (aged from 21 to 24), 
not suffering from both cardiovascular and evident mental 
pathologies, was recruited to participate in the experiment. The 
experimental protocol for this study was approved by the ethical 
committee of the University of Pisa and an informed consent was 
obtained from all participants involved in the experiment. All 
participants were screened by Patient Health Questionnaire™ 
(PHQ) and only participants with a score lower than 5 were 
included in the study^'. 

The affective elicitation was performed by projecting the lAPS 
images to a PC monitor. The slideshow was comprised of 9 image 
sessions, alternating neutral sessions and arousal sessions (see Fig. 3). 
The neutral sessions consist of 6 images having valence range (min = 
5.52, max = 7.08), and arousal range (min = 2.42, max = 3.22). The 
arousal sessions are divided into Low-Medium (L-M) and Medium- 
High (M-H) classes, according to the arousal score associated. Such 
sessions include 20 images eliciting an increasing level of valence 
(from unpleasant to pleasant). The L-M arousal sessions had a val- 
ence range (min = L95, max = 8.03), and an arousal range (min = 
3.08, max = 4.99). The M-H arousal sessions had a valence range 
(min = L49, max = 7.77), and an arousal range (min = 5.01, max = 
6.99). The overall protocol utilized 110 images. Each image was pre- 
sented for 10 seconds for a total duration of the experiment of 18 
minutes and 20 seconds. 

During the visual elicitation, the electrocardiogram (ECG) was 
acquired by using the ECGIOOC Electrocardiogram Amplifier from 
BIOPAC inc., with a sampling rate of 250 Hz. A block diagram of the 
proposed recognition system is illustrated in Fig. 4. In line with the 
CMA model, the combination of two levels of arousal and valence 
brings to the definition of four different emotional states. The stim- 
uli, with high and low arousal and high and low valence, produce 
changes in the ANS dynamics through both sympathetic and para- 
sympathetic pathways that can be tracked by a multidimensional 
representation estimated in continuous time by the proposed 
point-process model. The obtained features are then processed for 
classification by adopting a leave-one-out procedure. 

Algorithms. The ECG signal was analyzed off-line to extract the RR 
intervals^'', then further processed to correct for erroneous and 
ectopic beats by a previously developed algorithm"'^. The presence 
of nonlinear behaviors in such heartbeat series was tested by using a 
well-established time-domain test based on high-order statistics^''. 
The null hypothesis assumes that the time series are generated by a 
linear system. We set the number of laps to M = 8, and a total of 500 



bootstrap replications for every test. Experimental results are shown 
in Table 1. The nonlinearity test gave significant results (p < 0.05) on 
27 out of 30 subjects. In light of this result, we based our methodology 
on Nonlinear Autoregressive Integrative (NARI) models. Nonlinea- 
rities are intended as quadratic and cubic functions of the past RR 
intervals according to the Wiener- Volterra representation^*'^^. Major 
improvements of our approach rely on the possibility of performing a 
regression on the derivative RR series based on an Inverse Gaussian 
(IG) probability structure""' The quadratic nonlinearities contri- 
bute to the complete emotional assessment through features coming 
from the instantaneous spectrum and bispectrum"''''^'. It is worth- 
while noticing that our feature estimation is derived from an 
equivalent n'^-order inputoutput Wiener-Volterra model"'^'', thus 
allowing for the potential estimation of the fi'^-order polyspectra of 
the physiological signal""" (see Materials and Methods section for 
details). Moreover, by representing the RR series with cubic autore- 
gressive functions, it is possible to perform a further instantaneous 
nonlinear assessment of the complex cardiovascular dynamics and 
estimate the dominant Lyapunov exponent at each moment in 
time'"'. Indices from a representative subject are shown in Fig. 5. 
Importantly, the NARI model as applied to the considered data 
provides excellent results in terms of goodness-of-fit, and indepen- 
dence test, with KS distances never above 0.056. A comparison 
analysis was performed between the simple linear and NARI 
models considering the Sum of the Squared Distances (SSD) of the 
points outside the confidence interval of the autocorrelation plot (see 
Table 1). We report that nonlinear point-process models resulted in 
lower SSD on all the considered subjects. Further results reporting 
the number of points outside the confidence interval of the 
autocorrelation plot are shown in the Supporting Information. 

To summarize, the necessary algorithmic steps for the assessment 
of instantaneous ANS responses to short-time emotional stimuli are 
as follows: a) extract an artifact-free RR interval series from the ECG; 
b) use the autoregressive coefficients of the quadratic NARI expan- 
sion to extract the input-output kernels; c) estimate the instant- 
aneous spectral and bispectral features; d) use the autoregressive 
coefficients of the cubic NARI expansion and fast orthogonal search 
algorithm to estimate the instantaneous dominant Lyapunov expo- 
nent. All the extracted instantaneous features (see Materials and 
Methods) are used as input of the classification procedure described 
below. Of note, since no other comparable statistical models have 
been advocated for a similar application, goodness-of-fit and clas- 
sification performance of the proposed nonlinear approach are com- 
pared with its linear counterpart: the basic linear point-process 
model described in"*', is also used here for the first time for the 
proposed classification analysis. Clearly, in order to perform a fair 
comparison, the order of such a simple linear IG-based point-process 
model was chosen considering an equal number of parameters that 
need to be estimated for the nonlinear models. 

Moreover, here we report on the usability of other simple point- 
process models having the Poisson distribution as inter-beat pro- 
bability. Such a model gave poor performances in terms of both 
goodness-of-fit and resulting in sufficient reliability to solve the pro- 
posed classification problems. All of the algorithms were implemen- 
ted by using Matlab® R2013b endowed with self-made code and 
three additional toolboxes for pattern recognition and signal proces- 
sing, i.e. LIBSVM'^^, PRTooF^' and time series analysis toolbox*''. 

Classification. To perform pattern recognition of the elicited 
emotional states, a two-class problem was considered for the 
arousal, valence and self-reported emotion: Low-Medium (L-M) 
and Medium-High (M-H). The arousal classification was linked to 
the capability of our methodology in distinguishing the L-M arousal 
stimuli from the M-H ones, with the neutral sessions associated to 
the L-M arousal class. The overall protocol utilized 110 images. 
According to the scores associated to each image, for each subject. 
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Figure 2 | An overview of the experimental set-up and block scheme of the overall signal processing and classification chain. The central nervous system 
is emotionally stimulated through images gathered from the International Affective Picture System. Such a standardized dataset associates multiple scores 
to each picture quantifying the supposed elicited pleasantness (valence) and activation (arousal). Accordingly, the pictures are grouped into arousal and 
valence classes, including the neutral ones. During the slideshow each image stands for 10 seconds, activating the prefrontal cortex and other cortical areas, 
consequently producing the proper autonomic nervous system changes through both parasympathetic and sympathetic pathways. Starting from the ECG 
recordings, the RR interval series are extracted by using automatic R-peak detection algorithms applied on artifact-free ECG. The absence of both 
algorithmic errors (e.g., mis-detected peaks) or ectopic beats in such a signal is ensured by the application of effective artifact removal methods as well as 
visual inspection. The proposed point-process model is fitted on the RR interval series, and several features are estimated in an instantaneous fashion. 
Then, for each subject, a feature set is chosen and then split into training and test set for support vector machine-based classification. This image was 
drawn by G. Valenza, who holds both copyright and responsibility. 



the dataset was comprised of 64 examples for the L-M arousal class 
and 40 examples for the M-H arousal class. Regarding valence, we 
distinguished the L-M from the M-H valence regardless of the images 
belonging to the neutral classes. This choice is justified by the fact 
that the neutral images can be equally associated to the L-M or M-H 
valence classes. According to the experimental protocol timeline, for 



each subject, the dataset was comprised of 40 examples for the L-M 
valence class and 40 examples for the M-H valence class. For the self- 
reported emotions, we used labels given by the self-assessment 
manikin (SAM) report. After the visual elicitation, in fact, each 
subject was asked to fill out a SAM test associating either a positive 
or a negative emotion to each of the seen images. During this phase. 
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Figure 3 | Sequence scheme over time of image presentation in terms of 
arousal and valence levels. The y axis relates to the ofBcial lAPS score, 
whereas the x axis relates to the time. The neutral sessions, which are 
marked with blue lines, alternate with the arousal ones, which are marked 
with red staircases. Along the time, the red Kne foUows the four arousal 
sessions having increasing intensity of activation. The dotted green Une 
indicates the vfl/ence levels distinguishing the low-medium (L-M) and the 
medium-high (M-H) level within an arousing session. The neutral sessions 
are characterized by lowest arousal (<3) and medium valence scores 
(about 6). 

the images were presented in a different randomized order with 
respect to the previous sequence. 

As a further challenge, a more difficult four-class problem was 
solved by defining the recognition of four different emotional states 
associated with specific regions of the arousal-valence plane. 
Specifically, given a selective combination of the L-M and M-H levels 
of the arousal and valence CMA dimensions, the following four 
emotional states were distinguished: "sadness", as a simultaneous 
effect of L-M valence and L-M arousal; "anger", as a simultaneous 
effect of L-M valence and M-H arousal; "happiness", as a simultan- 
eous effect of M-H valence and M-H arousal; "relaxation", as a 
simultaneous effect of M-H valence and L-M arousal (see Fig. 1). 

For each of the mentioned classification problems, a leave-one-out 
procedure" was performed on the N available features using a well- 
known Support Vector Machine (SVM)"" as pattern recognition 



algorithm. Specifically, we used a nu-SVM (nu = 0.5) having a radial 

basis kernel function K{xi,Xj) =exp^ — y||x,— with y = N~\ 

xe?Si^ as the training vectors. Results gathered from SVM classifiers 
are summarized in Tables 1 and 2 and reported as recognition accu- 
racy, i.e. the percentage of correct classification among all classes for 
each subject and for each of the emotion classification cases. 

All the estimated HRV indices coming from the NARI models 
were processed in order to define the whole feature set Fjvar/- Such 
a whole feature set is comprised of five linear-derived and eleven 
nonlinear-derived features. In this case, linear-derived means that 
such features, namely the mean and standard deviation of the IG 
distribution (corresponding to the instantaneous point process defi- 
nitions of mean and standard deviation of the RR intervals'"'), the 
power in the low frequency (LF) band, the power in the high fre- 
quency (HF) band, and the LF/HF ratio, are estimated having only 
knowledge of the linear coefficients of the models. Likewise, the non- 
linear-derived features are estimated from the quadratic and cubic 
terms of the NARI model. In particular, features coming from the 
instantaneous bispectral analysis, namely the mean and the standard 
deviation of the bispectral invariants, mean magnitude, phase 
entropy, normalized bispectral entropy, normalized bispectral 
squared entropy, sum of logarithmic bispectral amplitudes, and non- 
linear sympatho -vagal interactions, are derived from the quadratic 
formulation (see Supporting Information), whereas the dominant 
Lyapunov exponents analysis was performed engaging cubic nonli- 
nearities (see Materials and Methods). 

In order to enhance the personalization of the system as well as 
improve the recognition accuracy, given the complete set pNARh we 
further derived five subsets obtained by applying feature selection. 
Specifically, we defined the subset Fi j^ar/ having the linear-derived 
features only, the subset F2j^ari having the linear-derived along with 
the nonlinear-derived bispectral features, the subset F^j^ari having 
the linear- derived features alongside the dominant Lyapunov expo- 
nent, and the subset F4,nari having the five most informative features 
as quantified through the Fisher score"""*. Of note, the sets verify the 
following relationships: Fi^^ri C Fz^ari Fnaw; ^ i.nar7 F^j^^ri 
^ Fnari'-: F4j^ari C ^naju; firstly, we illustrate in Table 1 the clas- 
sification results by comparing the performances of the reference 
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Figure 4 | Logical scheme of the overall short-time emotion recognition concept. The autonomic nervous system acts on the cardiovascular system 
modulating its electrical activity. This activity affects the heartbeat dynamics, which can be non-invasively revealed by the analysis and modeling of the RR 
interval series. To perform this task, we propose to consider a point-process probability density function in order to characterize cardiovascular dynamics 
at each moment in time. In particular, we use Wiener-Volterra nonlinear autoregressive integrative functions to estimate quantitative tools such as 
spectrum and bispectrum from the linear and nonlinear terms, respectively. Given the instantaneous spectra and high-order spectra, several features are 
combined to define the feature set, which is the input of the personaUzed pattern recognition procedure. Support vector machines are engaged to perform 
this task by adopting a leave-one-out procedure. 



SCIENTIFIC REPORTS | 4:4998 | DDI: 1 0. 1 038/srep04998 



5 



Table 1 | Experimental results on the short-time emotional response using point-process models. Columns refer to: subject identifier, non- 
linearity test. Sum of the Squared Distances (SSD) of the points outside the confidence interval of the autocorrelation plot, and accuracy (in 
percentage) of SVM classification considering \}r\e arousal, valence, and self-reported emotion classification coses. For each classification, 
//near stands for accuracies given by the simple linear point-process model, whereas non//near stands for accuracies given by nonlinear 
models. The ticks indicate improvements given by the use of point-process nonlinear models versus their linear counterparts 

Autocorrelation (SSD) Arousal Valence Emotion 
Subjects p-value* Linear— > Nonlinear Linear^ Nonlinear Linear^ Nonlinear Linear^ Nonlinear 


1 




<10-' 


0.3103^0.0718,/ 


81.48^84.15,/ 


70.00^76.54,/ 


68.12^70.00) 


2 




<io-^ 


0.2166^0.0722,/ 


72.29^ 80.76,/ 


70.73^ 83.33) 


66.29^75.29) 


3 




<10-' 


0.2576^0.0001,/ 


79.27^ 89.02,/ 


65.43^ 82.72,/ 


67.47^72.29) 


4 




<0.01 


0.1842^0.0656,/ 


71.95-* 85.36) 


69.14^77.78,/ 


56.82^70.45) 


5 




<0.05 


0.1327^0.0437,/ 


85.36^84.15 


61.73^69.13,/ 


78.57^62.86 


6 




<0.005 


0.1031 ^0.0124,/ 


75.61 ^86.42) 


92.59^ 85.00 


70.73 ^ 66.67 


7 




<0.05 


0.0925^0.0229,/ 


90.12^94.44) 


80.00^84.42) 


56.25^72.22) 


8 




<0.002 


0.251 1 ^0.0280,/ 


85.18^87.65,/ 


76.25^76.25 


66.10^69.49) 


9 




<0.05 


0.1661 ^0.0701,/ 


62.20 -> 70.37) 


58.02^73.75) 


58.33 -> 60.00) 


10 




<10-' 


0.1220^0.0338,/ 


79.27^85.00) 


60.49^65.38) 


55.56^65.38) 


11 




<0.03 


0.1028^0.0457,/ 


76.82^76.54 


72.84^75.00) 


73.24^75.71) 


12 




<0.02 


0.2058^ 0.0143,^ 


84.15^86.42,/ 


79.01 ^80.00) 


77.38^74.70 


13 




<0.004 


0. 1205^ 0.0501 v' 


81.70^78.05 


59.26^75.31) 


62.03^73.42) 


14 




<0.002 


0.2529^0.0160,/ 


90.24^ 82.93 


79.01 ^ 86.42) 


74.67^ 84.00) 


15 




<0.004 


0.1242^0.0219,/ 


86.58^ 87.80,/ 


79.01 ^75.31 


68.12^72.46) 


16 




<0.008 


0.1998^ 0.0660,/ 


69.51 ^69.51 


77.77 ^75.3^ 


67.61 63.38 


17 




>0.05 


0.2597^0.0001,/ 


86.58^92.21) 


82.72^88.31) 


82.14^ 82.27) 


18 




<0.03 


0.2492^0.0549,/ 


82.93^ 86.59) 


86.42^ 87.65) 


77.91 ^79.07) 


19 




<10-' 


0.1 188^0.0243,/ 


75.60^74.39 


80.24^79.01 


64.47^65.79) 


20 




<0.002 


0.1529^0.0436,/ 


71.95^ 82.93,/ 


64.20^77.78) 


60.23^70.45) 


21 




<0.01 


0.1513^0.0501,/ 


71.95^ 86.59,/ 


69.14^ 81.48) 


70.83^68.06 


22 




>0.05 


0.2095^ 0.01 26v' 


93.90^95.12,/ 


79.01 ^76.54 


62.16^70.27) 


23 




<0.05 


0.3251 ^0.0718,/ 


85.37^ 87.80) 


90.12^93.83) 


86.15^ 89.23) 


24 




<0.01 


0.0904^0.0165,/ 


69.51 -» 84.15,/ 


74.07^ 80.25) 


69.32-^75.00) 


25 




<0.05 


0.0933^0.0509,/ 


75.31 80.25,/ 


66.25^ 82.5) 


60.00-* 62.67) 


26 




<10-' 


0.1068^0.0185,/ 


64.55^78.48,/ 


62.82^69.23) 


56.00^62.67) 


27 




>0.05 


0.0675 ^ 0.0408,/ 


79.27^ 85.37,/ 


77.78-^77.78 


72.41 ^74.71) 


28 




<10-' 


0.0903 ^ 0.0470,/ 


81.70^ 82.7l) 


69.14^ 80.00) 


62.86^72.46) 


29 




<10-' 


0.2272 -» 0.01 72,/ 


80.25^ 85.19,/ 


75.00^78.75) 


62.50^68.75) 


30 




<io-* 


0.1779^0.0136,/ 


72.75^76.25,/ 


62.5^79.75) 


61.63^73.26) 



linear point-process model with the NARI model with feature set 
giving the best recognition accuracy for each subject. The recognition 
accuracy of the short-term positive-negative self-reported emotions 
improves with the use of the nonlinear measures in 25 cases, with 
>60% of successfully recognized samples for all of the subjects and a 
maximum of 89.23% for subject 23. In 19 out of the 30 subjects the 
accuracies are >70% with a total average accuracy of 71.43% 
Concerning the L-M and M-H arousal classification, the recognition 
accuracy of the short-term emotional data improves in 24 cases, with 
>69.51% of successfully recognized samples for all of the subjects 
and a maximum of 95.12% for subject 22. In 29 out of the 30 subjects 
the accuracies are >70%, 23 of which are above 80%. The total 
average accuracy is 83.55%. Concerning the L-M and M-H valence 
classification, the recognition accuracy of the short-term emotional 
data is unproved in 23 cases, with >65.38% of successfully recog- 
nized samples for all of the subjects and a maximum of 93.83% for 
subject 23. In 27 out of the 30 subjects the accuracy is >70%, 13 of 
which are above 80%. The total average accuracy is 79.15%. 

Moreover, we show in Table 2 all the results obtained using the 
complete feature set Fnari- Although such a set might be not opti- 
mized, this choice allows for practical affective computing applica- 
tions, possibly in real-time, retaining all of the information from both 
linear and nonlinear characteristics of the heartbeat complex 
dynamics. Alongside the separate arousal, valence, and self-reported 
emotions, we also show the accuracy in recognizing the four emo- 
tional states defined in the arousal-valence plane. In this case, the 
recognition accuracy is >67.24% of successfully recognized samples 
for all subjects, with best performance for subject 23 (94.83%). In 27 



out of the 30 subjects the accuracy is >70%, 14 of which are above 
80%. The total average accuracy is 79.29%. 

As a graphical representation of the results. Fig. 6 shows the com- 
plementary specificity-sensitivity plots for the three emotion clas- 
sification cases. The complementary specificity is defined as 
(1-specificity), corresponding to the false positive rate. The area of 
the maximum rectangle that can be drawn within the lower-right 
unit plane not including any of the points (each representing the 
result from one subject) can be defined as the "maximum area under 
the points" (MAUP): the higher the MAUP, the greater the perform- 
ance. The comparative evaluation of the maximum area under the 
points reveals that the point-process nonlinear (NARI) features give 
the best results on all the three emotion classification cases. 

Other classification algorithms. The performances given by the 
proposed point process NARI model were also tested using the 
following six classification algorithms'"': Linear Discriminant 
Classifier (LDC), Quadratic Discriminant Classifier (QDC), K- 
Nearest Neighborhood (KNN), MultiLayer Perceptron (MLP), 
Probabilistic Neural Network (PNN), Vector Distance Classifier 
(VDC). LDC and QDC are simple discriminant classifiers, i.e., 
there is no linear or nonlinear mapping of the feature set. 
Specifically, LDC finds the linear function that better discerns the 
feature space, whereas QDC finds the quadratic, thus nonlinear, one. 
MLP and PNN are common neural network-based algorithms 
performing a nonlinear mapping of the feature space and are able 
to find nonlinear functions discriminating the classes. KNN is a 
clustering algorithm through which the test set is associated to the 
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Figure 5 | Instantaneous tracking of the HRV indices computed from a representative subject using the proposed NARI model during the passive 
emotional elicitation (two neutral sessions alternated to a L-M and a M-H arousal session). In the first panel, the estimated ;iRR(t, Hi, {(f)) is 
superimposed on the recorded R-R series. Below, the instantaneous heartbeat power spectra evaluated in Low frequency (LF) and in High frequency (HF), 
the sympatho-vagal balance (LF/HF), several bispectral statistics such as the nonlinear sympathovagal interactions LL, LH, and HH, and the 
instantaneous dominant Lyapunov exponent (IDLE) are reported. In the three bottom panels, the tracking of emotion classification is shown in terms of 
arousal (A), valence (V), and their combination according to the circumplex model of affect (see Fig. 1). In such panels, the correct image classification is 
marked in red, whereas the misclassification is marked in blue. The neutral sessions are associated to the L-M arousal class exclusively without related 
valence class. This choice is justified as neutral stimuli could be arbitrarily associated to the L-M or M-H valence. 



most frequent class of the K nearest training examples. VDC is a 
KNNwithK = 1. 

Results showed that on the arousal classification, considering fea- 
tures coming from the point process NARI model the recognition 
accuracy of the short-term emotional data is improved as follows: 21 
cases using LDC, 17 cases using QDC, 23 cases using KNN, MLP, 
PNN, and VDC, among the total 30 cases. On the valence classifica- 
tion, considering features coming from the point process NARI 
model the recognition accuracy of the short-term emotional data is 
improved as follows: 24 cases using LDC, 18 cases using QDC and 
KNN, 27 cases using MLP, 22 cases using PNN, and 23 cases using 
VDC, among the total 30 cases. On the self-reported emotion clas- 
sification, considering features coming from the point process NARI 
model the recognition accuracy of the short-term emotional data is 
improved as follows: 21 cases using LDC, 18 cases using QDC and 
PNN, 22 cases using KNN, 24 cases using MLP, and 19 cases using 
VDC, among the total 30 cases. 

We also report that SVM algorithms always have the best recog- 
nition accuracy. 

Discussion 

We have presented a novel methodology able to assess in an instant- 
aneous, personalized, and automatic fashion whether the subject is 
experiencing a positive or a negative emotion. Assessments are per- 
formed considering only the cardiovascular dynamics through the 
RR interval series on short- time emotional stimuli (< 10 seconds). To 



achieve such results, we defined an ad-hoc mathematical framework 
based on point-process theory. The point-process framework is able 
to parametrize heartbeat dynamics continuously without using any 
interpolation methods. Therefore, instantaneous measures of HR 
and HRV"*'''" for robust short-time emotion recognition are made 
possible by the definition of a physiologically-plausible probabilistic 
model. An innovative aspect of the methodological approach is 
also the use of the derivative RR series'"''^" to fit the model. This choice 
allowed us to remarkably improve the tracking of the affec- 
tive-related, non-stationary heartbeat dynamics. The novel fully- 
autoregressive structure of our model accounts for the pioneering 
short-time affective characterization having knowledge related to 
both linear and nonlinear heartbeat dynamics. In fact, the quadratic 
and cubic autoregressive nonlinearities instantaneously associated to 
estimate the most likely heartbeat accounts for the estimation of high 
order statistics, such as the instantaneous bispectrum and trispec- 
tum, as well as significant complexity measures such as the instant- 
aneous Lyapunov exponents. These novel instantaneous features, 
together with an effective protocol for emotion elicitation and the 
support of machine learning algorithms, allowed to successfully solve 
a comprehensive four-class problem recognizing affective states such 
as sadness, anger, happiness, and relaxation. 

Unlike other paradigms developed in the literature for estimating 
human emotional states", our approach is purely parametric and the 
analytically- derived indices can be evaluated in a dynamic and 
instantaneous fashion. The proposed parametric model requires a 
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Table 2 | Experimental results on the short-time emotional response using NARI point-process models implementing quadratic and cubic 
autoregressive functions. For each subject, the accuracy of SVM classification is expressed in percentage considering the arousal, valence, 
and self-reported emotion classification cases. The accuracy on the 4-class emotion classification case is shown as "Emotion on Arousal- 
Valence Plane" 

Subjects Emotion on Arousal-Valence Plane Arousal Valence Emotion Self-Report 
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75.86 


84.15 


75.31 


70.00 
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78.19 


80.77 


83.33 


68.24 


3 


77.57 


87.80 


70.37 


63.86 
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79.31 


85.37 


77.78 


67.05 
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67.24 


84.15 


66.67 


61.43 


6 


89.47 


86.42 


83.75 


66.67 
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81.48 


91.67 


76.62 


72.23 
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73.68 


87.65 


76.25 


64.41 
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84.21 
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73.75 


55.00 


10 
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58.97 


60.26 
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87.72 
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80.00 


71.08 


13 


67.24 
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75.31 


70.89 


14 


82.76 


82.93 
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82.67 


15 
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69.57 


16 
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79.01 
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20 
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82.93 


76.54 


67.05 


21 
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85.37 


81.48 


68.06 


22 


77.59 


89.02 


74.07 


70.27 


23 


94.83 


80.49 


90.12 


86.15 


24 


87.93 


84.15 


80.25 


75.00 


25 


84.21 


75.31 


68.75 


53.33 


26 


76.36 


75.95 


67.94 


54.67 


27 


70.69 


80.49 


77.78 


60.92 


28 


71.93 


77.78 


76.25 


71.01 


29 


84.21 


83.95 


78.75 


66.25 


30 


80.36 


68.75 


79.75 


65.12 



preliminary calibration phase before it can be effectively used on a 
new subject. As the methodology is personalized, in fact, the model 
and classifier parameters have to be estimated using HRV data gath- 
ered during standardized affective elicitations whose arousal and 



valence scores lie on the whole CMA space. Moreover, a fine tuning 
of some of the model variables, e.g. the sliding time window W and 
the model linear and nonlinear orders, could be necessary to max- 
imize the performance. Of note, considering short-time RR interval 
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Figure 6 | Complementary Specificity-Sensitivity Plots (CSSPs) for tfie three emotion classification cases. In all panels, each blue circle represents a 
subject of the studied population, the x-axis represents the quantity 1 -specificity (i.e. false positive rate), and the y-axis represents the sensitivity (i.e. true 
positive rate). The gray rectangles define the maximum area under the points. Numbers inside the rectangles indicate the maximum area under the points 
expressed as percentage of the unit panel area. On top, from left to right, the three panels represent the CSSPs obtained using features from the point 
process NARI model and considering the arousal, valence, and self-reported emotion classification cases. Likewise, on the bottom, the three panels 
represent the CSSPs obtained using features from a point process linear model. 
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series gathered from 10 seconds of acquisition, currently used stand- 
ard signal processing methods would be unable to give reliable and 
effective results because of resolution or estimation problems. In 
particular, parametric spectral estimation such as traditional auto- 
regressive models would require interpolation techniques and would 
not give a goodness-of-fit measure, thus making the parameter 
estimation simply unreliable. In addition, non-parametric methods 
such as periodogram or Welch spectral estimation require stationar- 
ity and, considering the 10 second time window of interest, would 
give a frequency resolution of only 10"' Hz which is not sufficient to 
reliably assess spectral powers up to the low frequency spectral com- 
ponents (0.04-0.15 Hz). Fast emotional responses, such as the ones 
considered in several recent studies"'"'^'''", call for estimation algo- 
rithms that can perform a reliable emotional assessment at time 
resolution as low as 2 seconds. In such cases, our methodology would 
be able to accurately characterize the emotional response by provid- 
ing critical high resolution renditions. Accordingly, the methodology 
proposed here represents a pioneering approach in the current lit- 
erature and can open new avenues in the field of affective computing, 
clinical psychology, and neuro-economics. 

Taking advantage of the standard subdivision in LF and HF fre- 
quency ranges in the bispectral domain (see Fig. 1 in the supporting 
information), our method introduces novel nonlinear indices of 
heartbeat dynamics directly related to higher order interactions 
between faster (vagal) and slower (sympatho-vagal) heartbeat varia- 
tions, thus offering a new perspective into more complex autonomic 
dynamics. It has been demonstrated, in fact, that the cardiac and 
respiratory system exhibit complex dynamics that are further influ- 
enced by intrinsic feedback mechanisms controlling their inter- 
action^'. Cardio-respiratory phase synchronization represents an 
important phenomenon that responds to changes in physiological 
states and conditions, e.g. arousal visual elicitation'"*, indicating that 
it is strongly influenced by the sympatho-vagal balance^^. We pointed 
out that lower performance was reported in seven subjects when 
using nonlinear models. This outcome is neither correlated with 
goodness-of-fit measures nor with the nonlinearity test results on 
the RR interval series. Possible explanations may span from con- 
founding factors due to the fast protocol pace, to intrinsic individual 
characteristics of the emotional reaction to stimuli, or simple non- 
reported distraction from the task by the subject. Further investi- 
gations on these factors wUl be the subject of future studies. Since the 
proposed point-process framework allows the inclusion of physio- 
logical covariates such as respiration or blood pressure measures, 
future work will focus on further multivariate estimates of instant- 
aneous indices such as features from the dynamic cross-spectrum, 
cross-bispectrum^'', respiratory sinus arrhythmia"", and baroreflex 
sensitivity^"" in order to better characterize and understand the 
human emotional states in short-time events. We will also direct 
our efforts in applying the algorithms to a wide range of experimental 
protocols in order to validate our tools for underlying patho-physi- 
ology evaluation, as well as explore new applications on emotion 
recognition that consider a wider spectrum of emotional states. 

The primary impacts are in the affective computing field and in all 
the applications using emotion recognition systems. Our methodo- 
logy, in fact, is able to assess the personal cognitive association related 
to a positive and a negative emotion with very satisfactory results. 
The emotional model used in this work, i.e., the CMA, accounts for 
the emotional representation by means of the selective estimation of 
arousal and valence. Although the recognition accuracy proposed in 
this work relates to only two levels of arousal, valence and self- 
reported emotion, oversimplifying the complete characterization of 
the affective state of a subject, the emotional assessment in short-time 
events using cardiovascular information only is a very challenging 
task never solved before. Using only heartbeat dynamics, we effec- 
tively distinguished between the two basic levels of both arousal and 
valence, thus allowing for the assessment of four basic emotions. An 



important advantage is that the proposed framework is fully perso- 
nalized, i.e. it does not require data from a representative population 
of subjects. These achievements could have highly relevant impacts 
also in mood disorder psychopathology diagnosis and treatment, 
since the mood disorder produces an altered emotional response. 
Hence, monitoring fast emotional response in terms of stimulation 
time could make a continuous evaluation of disorder progression 
possible. Presently, emotional state is determined in a clinical setting 
using questionnaires with limited accuracy and quantitative power. It 
would be desirable to develop an automated system to determine and 
to quantify the emotional state. Doing this using a noninvasive 
physiological and easy-to-monitor signal such as the electrocardio- 
gram, it would represent an important scientific advancement. From 
a physiological perspective, the inherent nonlinearties of the cardio- 
vascular systems (e.g. the nonlinear neural signaling on the sinoatrial 
node''^) have been also confirmed by our experimental results. 
According to the nonlinearity tests^^, in fact, 27 out of the 30 RR 
series resulted to be the outputs of a nonlinear system. Of note, the 
results from goodness-of-fit tests were all positive, demonstrating 
that the proposed NARI model always performs a good prediction 
of the nonlinear heartbeat dynamics. We recently demonstrated the 
crucial role played by nonlinear dynamics in arousal and valence 
recognition starting from ANS signals'*''. In agreement with these 
previous results, here we have introduced instantaneous nonlinear 
features and improved the accuracy in the majority of the population 
(23 subjects) over the (also novel) results from the respective linear 
estimation. This finding is consistent with all the tested classification 
algorithms. A speculative explanation on the better performances 
obtained through ANS linear dynamics (observed in 7 subjects) 
can be related to the functional characteristics of the dynamical non- 
linear vago-sympathetic interaction, i.e., the so-called bidirectional 
augmentation^'. It is well-known, in fact, that the ANS control on the 
cardiovascular dynamics is based on the simultaneous activity and 
interaction of the sympathetic and vagal nerves. Sunagawa et al.^' 
showed that a perfect balance of these two neural activities leads to 
a linear control of the mean heart rate with a high potential dynam- 
ical gain (variance). On the other hand, when the sympathetic activ- 
ity prevails on the parasympathetic one and vice versa, the 
cardiovascular control becomes nonlinear with a reduced dynamical 
gain. Changes on these time-varying nonlinear vago-sympathetic 
interactions follows a sigmoidal relationship between autonomic 
nerve activity and heart rate. Accordingly, it could be possible to 
hypothesize that the vago-sympathetic interactions of those 7 sub- 
jects showing a more linear ANS dynamics, when elicited with emo- 
tional images, are somehow more balanced than in the other subjects. 

We also remark on the great improvement, over the random guess 
of 25%, performed by our methodology in recognizing four different 
emotional states during short-time stimuli. Such an achievement 
surely increases the impact in applicative fields such as affective 
computing, clinical psycology and neuro-economics. Moreover, 
our experimental results support previous studies where instant- 
aneous HRV indices extracted by means of a point process model 
provided a set of dynamic signatures able to characterize the dynamic 
range of cardiovascular responses under different physiological con- 
ditions^'. Therefore, the novel instantaneous nonlinear features 
could provide better assessment and improved reliability during such 
physiological responses. 

Methods 

The experimental protocol for this study was approved by the ethical committee of 
the University of Pisa and an informed consent was obtained from aU participants 
involved in the experiment. 

Nonlinear system identification and nonlinear autoregressive models. A 

Nonlinear Autoregressive iVIodel (NAR) Model can be expressed, in a general form, as 
follows: 
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y{k) = ¥{y{k-l),y{k-2), . . . ,y{k-M)) + c{k). 



(1) 



Considering c{k) as independent, identically distributed Gaussian random variables, 
such a model can be can be written as a Taylor expansion: 



:>'(*^)=>o+X]7i(!W/c-') 

1 = 1 

„ — -i;.—-{ ; — 1 J 



(2) 



The autoregressive structure of (2) allows for system identification with only exact 
knowledge on output data and with only a few assumptions on input data. We 
represent here the nonlinear physiological system by using nonlinear kernels up to the 
second order, i.e. )'o, >'i(0> yiiuj), and take into account the series of the 
derivatives in order to improve stationarity^^ ™. Hence, the general quadratic form of a 
Nonlinear Autoregressive Integrative (NARI) model becomes: 



Since f{t\Ht,Q{^)) indicates the probability of having a beat at time t given that a 
previous beat has occurred at Up ii^{tji.i,c{t)) can be interpreted as the expected 
waiting time until the next event could occur. The use of an inverse Gaussian 
distribution/(i|?^,,(;(^))i characterized at each moment in time, is motivated both 
physiologically (the integrate-and-fire initiating the cardiac contraction^^) and by 
goodness-of-fit comparisons^^. In previous work^^'^^, the instantaneous mean 
/ijy{(f,?^j,c(f)) was expressed as a linear and quadratic combination of present and 
past R-R intervals, based on a nonlinear Volterra- Wiener expansion*". Here we 
propose the novel NARI formulation in which the instantaneous RR mean is defined 



A/RR(f,?^„^(f))-RRN(o+>'o(0 



■ j^]'l(''0 fRRN(0-i~^^N(i)-<-l) 
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(3) 



where Ay{k - i) - y{k - i) - y{k - i - I) and Ay{k - j) ^ y{k - j) - y{k - j - 1). 
The quadratic kernel y2{Uj) is assumed to be symmetric. We also define the extended 
kernels y'i{i) and y'lii.j) as: 



7i(i) 



y'liij)- 



1, if 1 = 0 

-yi(i) if l<i<M 



if ij = 0 A i+j<M 
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(4) 



(5) 



and link the NARI model to a general input-output form, here defined by using the 
well-known Wiener- Volterra^'* series: 



y{k) = ho+J2h,{i)Ac{k-i) 



■J2J2---J2hAh,...J„)n&c{k-i,). 



(6) 



where the functions /t„(ti, . . ., t„) are the Volterra kernels. Mapping a quadratic NARI 
model to an n-th order input-output modeP" allows, after the input-output 
trasformation of the kernels, the evaluation of all the High Order statistics (HOS) of 
the system, such as the Dynamic Bispectrum and Trispectrum^^'^'*. In the next 
paragraph, the definition of the generic variable y{k) is used to represent the first- 
moment statistic (mean) of the probabilistic generative mechanism of the heart rate. 
In other words, the generic formulation of the NARI model parametrizes the 
autonomic control on the cardiovascular system using linear and nonlinear terms 
according to the Wiener- Volterra representation and point-process theory. 

Point-process nonlinear model of the heartbeat. The point process framework 
primarily defines the probability of having a heartbeat event at each moment in time. 
Such a probability accounts for the instantaneous estimation of features which are 
sensible to short-time emotional variations. Formally, defining t ^ {0, T], the 
observation interval, and 0 < Wi < ■ ■ ■ <Uk< u^^ i < ■ ■ ■ <Uk<T the times of the 
events, we can define N{t) — max{fc : u/^^ t} as the sample path of the associated 
counting process. Its differential, dN{t), denotes a continuous -time indicator 
function, where dN{t) = 1 when there is an event (the ventricular contraction), or 
dN{t) — 0 otherwise. The lefl: continuous sample path is defined as 

N{t) — limz^t- N(t) — max{fc : Uk<t}. Given the R-wave events {w^jj^^ detected 
fi^om the EGG, RRy ~ Uj — Uj-i > 0 denotes the j"" RR interval. Assuming history 
dependence, the inverse Gaussian probability distribution of the waiting time t — Uj 
until the next R-wave event is^^: 



.n (^RRfj(,)_,^-RR^(,)_,^_ij 

In this work, we consider nonlinearities associated to eq. 8 up to the third-order. 
Cubic terms, in fact, allow for the estimation of the dominant Lyapunov exponent, 
whereas quadratic terms account for the high-order spectral estimation. Considering 
the derivative RR interval series improves the achievement of stationarity within the 
sliding time window W (in this work we have chosen W — 70 seconds)^^. Since 
l.i^{tji.i,c{t)) is defined in continuous time, we can obtain an instantaneous RR 
mean estimate at a very fine timescale (with an arbitrarily small bin size A), which 
requires no interpolation between the arrival times of two beats. Given the proposed 
parametric model, all linear and nonlinear indices are defined as a time-varying 
function of the parameters ^{t) - [Co(0. To(f). y«('i. 0] with n - {1, 2, 3}. The 

unknown time- varying parameter vector ^{t) is estimated by means of a local 
maximum likelihood method^'-^^-^^. The model goodness-of-fit is based on the 
Kolmogorov-Smirnov (KS) test and associated KS statistics (see details in^"'). 
Autocorrelation plots are considered to test the independence of the model- 
transformed intervals^^. Optimal order of linear and nonlinear regressions is 
determined by prefitting the point process model to a subset of the data^^-^^ and 
comparing scores from KS tests, autocorrelation plots, and the Akaike Information 
Criterion (AIC) which is defined as follows: 



A7C- -2L + 2dim(c 



(9) 



where L is the maximized value of the likelihood function for the estimated model, 
dim(c) is the number of parameters in the NARI model. We choose the parameter 
setup having the autocorrelation plot within the boundaries of 95% confidence 
interval and minimum AIC value and minimum KS distance. Once the order of linear 
and nonlinear regressions is determined, the initial NARI coefficients are estimated 
by the method of least squares'^. 

Estimation of the input-output voherra kernels. The M"'-order spectral 
representations are related to the the Volterra series expansion and the Volterra 
theorem^*. In functional analysis, a Volterra series denotes a functional expansion of a 
dynamic, nonlinear, and time -invariant function, widely used in nonlinear 
physiological modeling^^'^^'^". The quadratic NARI model can be linked to the 
traditional input-output Volterra models by using a specific relationships^" between 
the Fourier transforms of the Volterra kernels of order p, Hpifi, . and the Fourier 
transforms of the extended NAR kernels, Ti (fi ) and r2 (fi J2 ) ■ In general, a second- 
order NARI model has to be mapped into a infinite-order input-output Volterra 
moder", although there is the need to truncate the series to a reasonable order for 
actual application. Here, we choose to model the cardiovascular activity with a cubic 
input-output Volterra by means of the following relationships with the NARI: 



Hi{f) = 



ri(/) 



r'i(fi)r;(f2) 



(10) 



(11) 



X exp 



2n{t — Uj 
2 ^l^^{uH,,Q{t)f(t-''i) 



(7) 



(/<73(l))ri(/,3(2)) 
X -H2 (frf(l) +/ti3(2) ,/<73(3) ) • 



(12) 



withy — JV(f) the index of the previous R-wave event before time 

tji^ ^ (uy,RR^,RR^_i, RR^_M+i) is the history of events, Q{t) the vector of the 

time-varing parameters, }.i^{tJ-Li^c{t)) the first-moment statistic (mean) of the 
distribution, and i^o(0 > 0 the shape parameter of the inverse Gaussian distribution. 



Quantitative tools: high order spectral analysis. Our framework allows for three 
levels of quantitative characterization of heartbeat dynamics: instantaneous time- 
domain estimation, linear power spectrum estimation, and higher order spectral 
representation. The time-domain characterization is based on the first and the second 
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order moments of the underlying probability structure. Namely, given the time- 
varying parameter set ^{t), the instantaneous estimates of mean R-R, R-R interval 
standard deviation, mean heart rate, and heart rate standard deviation can be 
extracted at each moment in time^^. The linear power spectrum estimation reveals the 
linear mechanisms governing the heartbeat dynamics in the frequency domain. In 
particular, given the input-output Volterra kernels of the NARI model for the 
instantaneous R-R interval mean f.tj^^{t,Ht,c{t)), we can compute the time- varying 
parametric (linear) autospectrum^' of the derivative series: 



Q(J,t) = S„(J,t)H,(J,t)H,{-f,t) 
3 



271 



(13) 



where 5^ (/,f) — o"^. The time- varying parametric autospectrum of the R-R intervals 
is given by multiplying its derivative spectrum Q(f,t) by the quantity 2(1 — cos{oj)y^. 
Importantly, previous derivations of the expressions for the autospectrum*"'^^ were 
possible because the first- and second-order Volterra operators are orthogonal to each 
other for Gaussian inputs. This property does not hold for orders greater than two^\ 
and in cubic nonlinear input-output Volterra systems the autospectrum is estimated 
by considering also the third order term. By integrating (13) in each frequency band, 
we can compute the index within the very low frequency (VLF — 0.01-0.05 Hz), low 
frequency (LF — 0.05-0.15 Hz), and high frequency (HF — 0.15-0.5 Hz) ranges. The 
higher order spectral representation allows for consideration of statistics beyond the 
second order, and phase relations between frequency components otherwise 
suppressed^^'^^. Higher order spectra (HOS), also known as polyspectra, are spectral 
representations of higher order statistics, i.e. moments and cumulants of third order 
and beyond. HOS can detect deviations from linearity, stationarity or Gaussianity. 
HOS analysis has been proven to be effective in several HRV applications (e.g. 
arrhythmia recognition^^'^^). Particular cases of higher order spectra are the third- 
order spectrum (Bispectrum) and the fourth-order spectrum (Trispectrum)^^, 
defined from the Volterra kernel coefficients estimated within the point process 
framework. 

Dynamical bispectrum estimation. Let H2(fi,f2, t) denote the Fourier transform of 
the second-order Volterra kernel coefficients. The cross-bispectrum (Fourier 
transform of the third-order moment) is^^'^'': 



CrossBis(fi/2,0-25,,(/i,0 

>5..(/2,0H2(-/i,-/2,t) 



(14) 



(15) 



where Sxxif,^) is the autospectrum of the input (i.e. o\^. Note that we use the 
approximation shown in eq. 14 since the equality only strictly holds when the input 
variables are jointly Gaussian. The analytical solution for the bispectrum of a 
nonlinear system response with stationary, zero-mean Gaussian input is^^: 

Bis(/i/2,0-2H2Cfi-F/2,-/2,0 

H,(-/i-/2,0Hi(/2,f) 
X S^if, +f2,t)S,,{f2,t)-\-2H2{fl +/2,-/l,0 

xHi{-fi-f2,t)Hi{fiJ)S^{fi^f2j)S^,(fut) 
-\-2H2{-fi.-f2,t)H,(fi.t)H,(f2,t) 
xS,,{fut)S,,{f2j) 

Of note, an expression similar to 15 was derived in the appendix of^*. The dynamic 
bispectrum is an important tool for evaluating the instantaneous presence of non- 
linearity in time series^^'^^'^^. The bispectrum presents several symmetry properties'^ 
and for a real signal it is uniquely defined by its values in the triangular region of 
computation Q, 0 </i ^/i +/2 — 1^^- Within this region, it is possible to estimate 
several features, which are summarized in Table 1 of the supporting information. The 
bispectral invariant'^, P{a, t) represent the phase of the integrated bispectrum along 
the radial line with the slope equal to a (with mean P{a,t) and variance o"p(„,f)). The 
parameter 0 < a :^ 1 is the slope of the straight line on which the bispectrum is 
integrated. The variables IX^, t) and /,(«, f) refer to the real and imaginary part of the 
integrated bispectrum, respectively. The Mean magnitude Mi„ean(0 snd the phase 
entropy Pe(f)^^ have « = 0, 1, ...,N— 1 ; L as the number of points within the region 
<1) the phase angle of the bispectrum, and 1(.) the indicator function which gives a 
value of 1 when the phase angle <1) is within the range of bin T„. The normalized 
bispectral entropy (Pi(0)^^ a-nd the normalized bispectral squared entropy iP2{t)T^ are 
also considered (between 0 and 1 ) along with the sum of logarithmic amplitudes of the 
bispectrum^^. As the sympatho -vagal linear effects on HRV are mainly characterized 
by the LF and HF spectral powers^*-^"'^^"^*, we evaluate the nonlinerar sympatho -vagal 
interactions by integrating |B(/i,/2)| in the bidimensional combination of frequency 
bands. LL{t),LH{t), and HH(0 (see Supporting Information for a graphical 
representation in the bispectral plane). 



Dynamical trispectrum estimation. BrUlinger^^, Billings^*, Priestley^^, and others 
have demonstrated that there is a closed form solution for homogeneous systems with 
Gaussian inputs. Thus, the transfer function of a m-order homogeneous system is 
estimated by the relation: 



H,„ /!,.../„, 



-In.) 



m^-S,x(fl) ■ ■ ■ Sxxlfm) 



(16) 



where the numerator is the m + 1 — n'^order crosspolyspectrum between y and x. 
This result is a generalization of the classical result for the transfer function of a linear 
system resulting for m — 1 . Therefore, the cross-trispectrum (Fourier transform of the 
third-order moment) can be estimated as: 



r(f,,/2/3,t)«3!5„(fi,()5„(/2,t)5„(/-3,() 

XHiifuf2,fi,t) 



(17) 



Dominant lyapunov exponent estimation. Cubic nonlinearities, in the framework 
of the proposed NARI point-process model, account for the novel estimation of the 
instantaneous Lyapunov spectrum. By definition, the generic Lyapunov exponent 
(LE) of a real valued function /(f) defined for f > 0 is defined as: 



A-lim sup-log([/'(0|) 

(^o::. t 



(18) 



Let us consider a generic tt-dimensional linear system in the form — Y{t) pi, where 
Y{t) is a time-varying fundamental solution matrix with Y (0) orthogonal, and {p,} is 
an orthonormal basis of IR". The key theoretical tools for determining the IDLE and 
the whole spectrum of LEs is the continuous QR factorization of Y {ty^-^^: 



Y{t) = Q{t)R{t) 



(19) 



where Q(f) is orthogonal and R{t) is upper triangular with positive diagonal elements 
Rij, 1 < f < n. Then, LEs are formulated as: 



A,-- lim-log||y(Op,-|| 

t-xX) t 

= limilog||R(Op;||= liraylogpiKOII- 



(20) 



The cubic NAR model (eq. 8) can be rewritten in an M-dimensional state space 
canonical representation: 



'ii-l 



F r 



(M) (M-1) 



if k<M 
if k = M 



(21) 



By evaluating the Jacobian /(«) over the time series, which corresponds to the matrix 
Y (t), the LE can be determined using the QR decomposition: 



/(n)Q(«-i)=Qw-R(„) 



(22) 



This decomposition is unique except in the case of zero diagonal elements. Then the 
LEs are given by 



. H-l 

,, = — VlnJ 



(23) 



where H is the available number of matrices within the local likelihood window of 
duration W, and t the sampling time step. The estimation of the LEs is performed at 
each time t from the corresponding time-varying vector of parameters, QitY''. Here, 
the first LE, Ai(t) is considered as the instantaneous dominant Lyapunov exponent 
(IDLE). 
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